Closed Bug 1074149 Opened 11 years ago Closed 8 years ago

Allow setting a PR_LOG() module's logging level at runtime

Categories

(NSPR :: NSPR, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: gsvelto, Assigned: wtc)

References

(Blocks 1 open bug)

Details

Currently a logging module will read its current logging level from the NSPR_LOG_MODULES environment variable either when the logging machinery is initialized or when the module is created (see _PR_SetLogModuleLevel()). It would be convenient to be able to set the logging level dynamically at runtime; this would entail a few changes: - Since there's currently no thread-safety measures when accessing a module one would either restrict setting the module level to the thread that created a module or providing at least some measure of safety (e.g. flagging the level field as volatile, locks/atomic accesses shouldn't be required since there's only one field to adjust and there's no memory ordering hazards involved). - Ensuring some kind of priority between _PR_InitLog() and the levels set manually since _PR_InitLog() will walk the list of existing modules and change their levels if NSPR_LOG_MODULES is set.
(In reply to Gabriele Svelto [:gsvelto] from comment #0) > Currently a logging module will read its current logging level from the > NSPR_LOG_MODULES environment variable either when the logging machinery is > initialized or when the module is created (see _PR_SetLogModuleLevel()). > > It would be convenient to be able to set the logging level dynamically at > runtime; this would entail a few changes: > > - Since there's currently no thread-safety measures when accessing a module > one would either restrict setting the module level to the thread that > created a module or providing at least some measure of safety (e.g. flagging > the level field as volatile, locks/atomic accesses shouldn't be required > since there's only one field to adjust and there's no memory ordering > hazards involved). Per discussion on the other bug, even just setting a field from thread A while reading it from thread B can cause B to see a value that is neither the before value nor the after value (i.e. TSAN data race). Blame the people who wrote the compiler specs; they explicitly wrote it to allow maximum optimization by assuming there are no cross-thread accesses possible.
(In reply to Randell Jesup [:jesup] from comment #1) > Per discussion on the other bug, even just setting a field from thread A > while reading it from thread B can cause B to see a value that is neither > the before value nor the after value (i.e. TSAN data race). Blame the > people who wrote the compiler specs; they explicitly wrote it to allow > maximum optimization by assuming there are no cross-thread accesses possible. Let's make this super-safe then even though it will require a little bit more work. Setting the level field shall be done using atomic_store_explicit(*, memory_order_relaxed) or an equivalent built-in and reading it using atomic_load_explicit(*, memory_order_relaxed) or an equivalent compiler-specific built-in. This means we'll have to extend the PR_ATOMIC_* macros to support them and considering the sparse compiler support this will require a fair bit of checking for compiler builtins which is a pain in the neck which I was fortunate enough to extensively explore already. For all other targets we can use PR_ATOMIC_SET() for stores and PR_ATOMIC_ADD(*, 0) for loads. Both of them have a significant cost across all architectures though.
(In reply to Randell Jesup [:jesup] from comment #1) > Per discussion on the other bug, even just setting a field from thread A > while reading it from thread B can cause B to see a value that is neither > the before value nor the after value (i.e. TSAN data race). I don't believe that was ever stated, nor is it true for this case as long as our field is properly aligned. See bug 881389 comment 67 (32-bit reads/writes are "atomic" for the architectures we care about at this point). Lets put things in perspective by looking at other more elaborate logging frameworks: * Apache's log4cxx does not synchronize access * log4cpp does not synchronize access * pantheios does not synchronize access * ACE logging does not synchronize access * Poco logging does not synchronize access * Boost is to much of a pain to figure out, I think it leaves it up to the user
(In reply to Eric Rahm [:erahm] from comment #3) > (In reply to Randell Jesup [:jesup] from comment #1) > > Per discussion on the other bug, even just setting a field from thread A > > while reading it from thread B can cause B to see a value that is neither > > the before value nor the after value (i.e. TSAN data race). > > I don't believe that was ever stated, nor is it true for this case as long > as our field is properly aligned. See bug 881389 comment 67 (32-bit > reads/writes are "atomic" for the architectures we care about at this point). That isn't the issue with TSAN. The problem is the language definition allows the compiler to use your field/variable/etc as spill (or other) storage temporarily between when you tell it to modify the value and some later point where the language spec says the value is now in the destination. So, at the ASM level, if int32_t a == 1, and you run this code: "a = 2", another thread can do "int32_t b = a" and get the value 3. Where did 3 come from? (It's not what you're talking about (a "torn" read/write)). Likely it came from a register spill perhaps between the last point someone cared about the old value of a (previous read or function/block entry) and the point where it has to have the new value there (2). Now, *if* the value subject to a TSAN data race is say a logging level or logging flags, the damage possible is low, and one might ignore it. If the value was a pointer to a logging structure, then you could UAF/wild-pointer. Also, in this very-specific case, you might decide that a possible mis-logging or crash/sec issue when the user changes the log level on the fly is simply "ok" given it's unlikelihood. Note: TSAN as an issue wasn't really understood until a few years ago, and lots of people (including me) dragged their heels with arguments just like yours - but a c/c++ compiler isn't a fancy assembler, AND it's explicitly not an inherently multithread-safe language; the language is made to assume (with limited exceptions) that it's implementing a single-threaded control flow. I don't know current numbers (cdiehl?) but a couple of years ago a 3rd-party ran TSAN against firefox and got 1000 hits loading bbc.co.uk. I know we've made a lot of progress since then. Let's not add to that total... > Lets put things in perspective by looking at other more elaborate logging > frameworks: Many of these were written before TSAN was well understood; and with TSAN even if there are competing threads accessing, *usually* the compiler didn't use your value as a spill. The risk is some future version will without any warning, or when you crank up the opt level, or turn off debug. And in the wild they can be virtually impossible to debug or even realize why once in a blue moon you crash or do something weird (and it gets blamed on "memory corruption", "cosmic rays", etc), or termed a heisenbug or compiler bug because it goes away when you turn off or reduce optimization. Note the examples of real-world problematic codegen in the intel blog post. I don't know what "does not synchronize access" means here; I don't know if they support multi-threaded logging; I don't know if they've ever scanned with TSAN; ever heard of TSAN or the sort of compiler-helped data-races here, etc. > * Apache's log4cxx does not synchronize access > * log4cpp does not synchronize access > * pantheios does not synchronize access > * ACE logging does not synchronize access > * Poco logging does not synchronize access > * Boost is to much of a pain to figure out, I think it leaves it up to the > user http://dslab.epfl.ch/pubs/portend.pdf https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
(In reply to Randell Jesup [:jesup] from comment #4) > That isn't the issue with TSAN. The problem is the language definition > allows the compiler to use your field/variable/etc as spill (or other) > storage temporarily between when you tell it to modify the value and some > later point where the language spec says the value is now in the > destination. Yes, but as stated in the other bug already flagging that field as volatile would be enough to prevent storage reuse nullifying any TSAN-race concern and make the access atomic (*). That being said I've proposed going with full blown atomics in comment 2 because: - It will generate the exact same code as if the field is volatile when using relaxed memory ordering so we won't have any overhead - It should obviously protect us from any compiler mischief, present or future - It will make people feel safer since we're doing the Right Thing™ The downside is obviously more code. *) As per spec since we're talking of a type smaller or equal to the underlying machine word size and properly aligned
In bug 1073578 comment 7 Gabriele suggested adding thread-safe functions for iteration and insertion of log modules, this bug seems like the right place to do that.
We support this on the mozilla logging side, at this point I don't think we care about fixing NSPR.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.